Abstract



Let $(S^1, d_{S^1})$ be the unit circle in $\mathbb{R}^2$ endowed with the arclength distance. We give a
necessary and sufficient condition for a general probability measure $\mu$ to admit a well defined Fréchet mean
on $(S^1, d_{S^1})$. We derive a new sufficient condition of existence, $P(\alpha,\varphi)$, with no restriction on the
support of the measure. Then, we study the convergence of the empirical Fréchet mean to the
Fréchet mean and we give an algorithm to compute it.

Keywords: circular data, Fréchet mean, uniqueness. AMS classification: 62H11.

1 Introduction 

1.1 Statistics for non-Euclidean data. 

In many fields of interest, the results of an experiment are objects taking values in non-Euclidean spaces. A
rather general framework to model such data is Riemannian geometry, and more particularly quotient
manifolds. As an illustration, in biology or geology, directional data are often used, see e.g. [14] or [6]
and references therein. In this case, observations take their values in the circle or a sphere, that is, a
Euclidean space quotiented by the action of scaling.

The usual definitions of basic statistical concepts were developed in a Euclidean framework.
Therefore, these definitions must be adapted for random variables with values in non-Euclidean spaces
such as manifolds. To describe the localization of a probability distribution, one needs to define a
central value such as a mean or a median. There have been multiple attempts to give a definition of a
mean in non-Euclidean spaces, see among many others [2, 4, 12, 13, 8, 5, 16] or [7].

In this paper, we consider the so-called Fréchet mean, see [7, 8, 12] or [2] and references therein.
We are particularly interested in the study of its uniqueness. The Fréchet mean is defined on general
metric spaces by extending the fact that the Euclidean mean minimizes the sum of the squared distances to
the data, see equation (1.3) below. To study the well definiteness of the Fréchet mean on a manifold,
two facts must be taken into account: non-uniqueness of geodesics from one point to another (existence
of a cut locus) and the effect of curvature, see e.g. [4] for further discussion. Due to the cut locus, the
distance function is no longer convex and finding conditions that ensure the uniqueness of the Fréchet
mean is not obvious. Two main directions have been explored in the literature: bounding the support
of the measure, in [3] for the n-spheres and in [8, 10, 12, 1] for manifolds, or considering special cases
of absolutely continuous radial distributions, see [9] for the unit circle and [10, 11] for projective
spaces. In a sense, these two conditions control the concentration of the probability measure. The
philosophy behind these works is to ensure a convexity property of the Fréchet functional given by
equation (1.2) below, see e.g. the introduction of [1] for a review of the above cited papers.

1.2 Fréchet mean on the circle



A standard way to extend the definition of the Euclidean mean to a non-Euclidean metric space is to use
the minimization property of the Euclidean mean. This definition is usually credited to M. Fréchet in
[7], although some authors credit it to E. Cartan, see e.g. [10]. Let $S^1$ be the unit circle of the plane,

$$S^1 = \{ (x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 = 1 \},$$

endowed with the arclength distance given for all $x = (x_1,x_2)$, $p = (p_1,p_2) \in S^1$ by

$$d_{S^1}(x,p) = 2\arcsin\left(\frac{\|x-p\|}{2}\right), \qquad (1.1)$$

where $\|x-p\| = \sqrt{(x_1-p_1)^2 + (x_2-p_2)^2}$ is the Euclidean norm in $\mathbb{R}^2$. Throughout the paper, $\mu$ is a
probability measure on $S^1$ and the Fréchet functional is defined for all $p \in S^1$ by

$$F_\mu(p) = \frac{1}{2}\int_{S^1} d_{S^1}^2(x,p)\,d\mu(x). \qquad (1.2)$$

The Fréchet functional is Lipschitz since, by the triangle inequality, we have $|F_\mu(p_1) - F_\mu(p_2)| \le
2\pi\, d_{S^1}(p_1,p_2)$ for any $p_1,p_2 \in S^1$. Thus, $F_\mu$ attains its minimum in at least one point and the only
issue at hand is uniqueness.
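To make (1.1) and (1.2) concrete, here is a minimal numerical sketch for a finitely supported measure. The function names (`arc_distance`, `frechet_functional`, `on_circle`) and the two-atom example are illustrative assumptions, not part of the paper. The example places equal mass on two antipodal points; the functional then takes the same value at the two points a quarter turn away, which illustrates why uniqueness is the issue.

```python
import numpy as np

def arc_distance(x, p):
    """Arclength distance (1.1) between unit vectors x and p of R^2."""
    chord = np.linalg.norm(np.asarray(x, float) - np.asarray(p, float), axis=-1)
    # clip guards against rounding pushing chord / 2 slightly above 1
    return 2.0 * np.arcsin(np.clip(chord / 2.0, 0.0, 1.0))

def frechet_functional(p, atoms, weights):
    """F_mu(p) = 1/2 * sum_i w_i d(x_i, p)^2 for mu = sum_i w_i delta_{x_i}."""
    d = arc_distance(np.asarray(atoms, float), p)
    return 0.5 * float(np.dot(weights, d ** 2))

def on_circle(angle):
    """Point of the unit circle with the given polar angle."""
    return np.array([np.cos(angle), np.sin(angle)])

# Hypothetical example: two antipodal atoms with equal weights.
atoms = [on_circle(0.0), on_circle(np.pi)]
weights = [0.5, 0.5]
f_up = frechet_functional(on_circle(np.pi / 2), atoms, weights)
f_down = frechet_functional(on_circle(-np.pi / 2), atoms, weights)
f_atom = frechet_functional(on_circle(0.0), atoms, weights)
print(f_up, f_down, f_atom)
```

For this measure the functional restricted to the upper half circle is $\frac14(\theta^2 + (\pi-\theta)^2)$, so its minimum value $\pi^2/8$ is reached at both quarter-turn points and the Fréchet mean is not well defined.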

Definition 1.1. We say that the Fréchet mean of a probability measure $\mu$ in $(S^1, d_{S^1})$ is well defined if
$F_\mu$ admits a unique argmin, that is, if there exists a unique $p^* \in S^1$ satisfying $F_\mu(p^*) = \min_{p \in S^1} F_\mu(p)$.
We then write

$$p^* = \operatorname{argmin}_{p \in S^1} F_\mu(p). \qquad (1.3)$$

The argmins of $F_\mu$ are also called Riemannian centers of mass [16] or intrinsic means [2], as
$(S^1, d_{S^1})$ is a simple one-dimensional compact Riemannian manifold. The advantage of dealing with
a simple object such as the circle is that curvature problems disappear and we only face the cut
locus problem. In this sense, it allows us to completely understand its effect on the non-convexity
of the distance function $d_{S^1}$, and to give a complete answer to the problem of uniqueness. In
what follows, we fully characterize the probability measures that admit a well defined Fréchet mean on
the circle $(S^1, d_{S^1})$. In particular, a necessary and sufficient condition is given in Theorem 4.1, which
links the existence of a Fréchet mean for a measure $\mu$ to the comparison between the distribution $\mu$ and
the uniform measure $\lambda$ on $S^1$. The surprising fact is that $\lambda$ appears as a benchmark to discriminate
measures having a well defined Fréchet mean. The uniform measure $\lambda$ is the 'worst' possible case, as
every point of the circle is a Fréchet mean: indeed, the Fréchet functional (1.2) is then constant and equal to
$\pi^2/6$.

In contrast to what has been done before, we do not try to ensure a convexity property of
the Fréchet functional. Indeed, the definition of the Fréchet mean relies on the global optimization
problem (1.3) which is, in general, non-convex. The advantage of our approach is that we do not need
to restrict the support or to impose restrictive symmetry conditions on the density. As the geometry
of a flat manifold is simple, we can derive explicit forms of the Fréchet functional and of its derivative,
which can be hard to compute in non-flat manifolds such as n-dimensional spheres.



1.3 Organization of the paper 

In Section 2, we introduce notations that will be used throughout the paper. In Section 3, we give
explicit expressions for the Fréchet functional and its derivative and we discuss some properties of
the critical points of the Fréchet functional. Section 4 contains the main result, namely the necessary and
sufficient condition of Theorem 4.1 for the existence of the Fréchet mean of a general measure. We also
propose a new sufficient criterion $P(\alpha,\varphi)$ that ensures the well definiteness of the Fréchet mean. In
Section 5, we study the convergence of the empirical Fréchet mean to the Fréchet mean, and describe
an algorithm to compute the empirical Fréchet mean.

2 Notations



In what follows, $1_A$ denotes the indicator function of the set $A \subset \mathbb{R}$ and the notation $\int_a^b f(t)\,d\mu_{p_0}(t)$
stands for the Lebesgue integral $\int_{[a,b[} f(t)\,d\mu_{p_0}(t)$ if $a < b$ and $\int_{[b,a[} f(t)\,d\mu_{p_0}(t)$ if $b < a$.

2.1 Normal coordinates 

Given a base point $p \in S^1$, there is a canonical chart called the exponential map defined from $T_pS^1 \simeq \mathbb{R}$,
the tangent space of $S^1$ at $p$, to $S^1$ and denoted by

$$e_p : \mathbb{R} \to S^1, \quad \theta \mapsto e_p(\theta) = R_\theta\, p, \qquad \text{where } R_\theta = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.$$

This map is onto but not one-to-one as it is $2\pi$-periodic. To guarantee injectivity, we choose to
restrict the domain of definition of $e_p$ to $[-\pi,\pi[$. Thus, for all $p_1,p_2 \in S^1$ there is now a unique
$\theta_{p_2}^{p_1} \in [-\pi,\pi[$ satisfying $e_{p_1}(\theta_{p_2}^{p_1}) = p_2$ and

$$e_p : [-\pi,\pi[ \to S^1 \quad \text{and} \quad e_p^{-1} : S^1 \to [-\pi,\pi[, \quad \text{for all } p \in S^1.$$

Such parametrizations are called normal coordinate systems centered at $p$, and $\theta_{p_2}^{p_1}$ is nothing else but
the coordinate of $p_2$ read in a normal coordinate system centered at $p_1$. To simplify the notations, we
will omit the exponent $p_1$ if no confusion is possible and we will write $\theta_{p_2}^{p_1} = \theta_{p_2}$.

The cut locus of a point $p_0 \in S^1$ is denoted by $\tilde p_0$ and is equal to the opposite point (in $\mathbb{R}^2$) of
$p_0$, that is $\tilde p_0 = -p_0$. In a normal coordinate system centered at $p \in S^1$, the coordinate of $\tilde p_0$ is
$\theta^p_{\tilde p_0} = \theta^p_{p_0} - \pi$ if $0 \le \theta^p_{p_0} < \pi$ or $\theta^p_{\tilde p_0} = \theta^p_{p_0} + \pi$ if $-\pi \le \theta^p_{p_0} < 0$.

2.2 Distance function, probability measures and the Fréchet functional

The arclength distance between two points $p_1,p_2 \in S^1$ was defined in (1.1). Given the normal coordinates
$\theta_{p_1},\theta_{p_2} \in [-\pi,\pi[$ of these points in the chart centered at an arbitrary point $p_0$,

$$d_{S^1}(p_1,p_2) = d_{\mathbb{R}/2\pi\mathbb{Z}}(\theta_{p_1},\theta_{p_2}) := \min\{ |\theta_{p_1} - \theta_{p_2} + 2\pi k| ,\ k \in \mathbb{Z} \}.$$

This means that the circle $S^1$ is locally isometric to the real line $\mathbb{R}$.
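The identity between the chord formula (1.1) and the quotient distance on $\mathbb{R}/2\pi\mathbb{Z}$ can be checked numerically. The sketch below is only an illustration: the helper names `wrap`, `dist_angles` and `dist_chord` are ours, and the minimum over $k$ is computed through the representative of the difference in $[-\pi,\pi[$ rather than by enumerating $k$.

```python
import numpy as np

def wrap(theta):
    """Representative of theta modulo 2*pi in [-pi, pi)."""
    return (np.asarray(theta, float) + np.pi) % (2.0 * np.pi) - np.pi

def dist_angles(t1, t2):
    """min_k |t1 - t2 + 2*pi*k|, computed through the wrapped difference."""
    return np.abs(wrap(np.asarray(t1, float) - np.asarray(t2, float)))

def dist_chord(t1, t2):
    """Formula (1.1) applied to the points of the circle with angles t1, t2."""
    x = np.stack([np.cos(t1), np.sin(t1)], axis=-1)
    p = np.stack([np.cos(t2), np.sin(t2)], axis=-1)
    chord = np.linalg.norm(x - p, axis=-1)
    return 2.0 * np.arcsin(np.clip(chord / 2.0, 0.0, 1.0))

# Compare the two expressions on random pairs of coordinates.
rng = np.random.default_rng(0)
t1 = rng.uniform(-np.pi, np.pi, size=1000)
t2 = rng.uniform(-np.pi, np.pi, size=1000)
max_gap = float(np.max(np.abs(dist_angles(t1, t2) - dist_chord(t1, t2))))
print(max_gap)
```

The wrapped form is the one used in the later sketches, since it avoids the loss of accuracy of the arcsine near antipodal points.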

Unless specified, $\mu$ will denote a general probability measure on $(S^1, \mathcal{B}(S^1))$ where $\mathcal{B}(S^1)$ is the Borel
$\sigma$-algebra of $S^1 \subset \mathbb{R}^2$. Given a point $p_0 \in S^1$, $\mu_{p_0}$ is the image measure of $\mu$ through $e_{p_0}^{-1} : S^1 \to [-\pi,\pi[$.
This is a measure on $\mathbb{R}$ with support in $[-\pi,\pi[$, which is defined by

$$\mu_{p_0}(A) = \mu \circ e_{p_0}(A \cap [-\pi,\pi[), \quad \text{for all } A \in \mathcal{B}(\mathbb{R}), \qquad (2.1)$$

where $\mathcal{B}(\mathbb{R})$ is the Borel $\sigma$-algebra of $\mathbb{R}$. The usual Euclidean mean/expectation and variance of $\mu_{p_0}$ are
denoted

$$m(\mu_{p_0}) = \int_{\mathbb{R}} t\,d\mu_{p_0}(t).$$

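Finally, let us introduce for any $p_0 \in S^1$ the map $F_{\mu_{p_0}} : [-\pi,\pi[ \to \mathbb{R}$ given by

$$F_{\mu_{p_0}}(\theta) = \frac12 \int_{\mathbb{R}} d^2_{\mathbb{R}/2\pi\mathbb{Z}}(t,\theta)\,d\mu_{p_0}(t) =
\begin{cases}
\frac12 \int_{-\pi+\theta}^{\pi} (t-\theta)^2\,d\mu_{p_0}(t) + \frac12 \int_{-\pi}^{-\pi+\theta} (t + 2\pi - \theta)^2\,d\mu_{p_0}(t), & \text{if } 0 \le \theta < \pi, \\
\frac12 \int_{-\pi}^{\pi+\theta} (t-\theta)^2\,d\mu_{p_0}(t) + \frac12 \int_{\pi+\theta}^{\pi} (t - 2\pi - \theta)^2\,d\mu_{p_0}(t), & \text{if } -\pi \le \theta < 0.
\end{cases}$$

3 The Fréchet functional on the circle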



3.1 The derivative of the Frechet functional 

A function $f : [-\pi,\pi[ \to \mathbb{R}$ is said to be left continuous on $[-\pi,\pi[$ if it is left continuous everywhere on
$]-\pi,\pi[$ and $\lim_{\varepsilon \to 0^-} f(\pi + \varepsilon) = f(-\pi)$. Similarly, $f$ is said to be continuous on $[-\pi,\pi[$ if it is
left and right continuous on $[-\pi,\pi[$. We provide here an explicit expression of the derivative of $F_\mu$.

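Proposition 3.1. Let $\mu$ be a probability measure on $(S^1, d_{S^1})$ and fix an arbitrary $p_0 \in S^1$. Then,
$F_\mu : S^1 \to \mathbb{R}$ is differentiable in the following sense:

1. Let $p \in S^1$ be a point with a cut locus of $\mu$-measure 0, i.e. $\mu(\{\tilde p\}) = 0$. Then $F_{\mu_{p_0}}$ is continuously
differentiable at $\theta_p^{p_0}$ and we have

$$\frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p^{p_0}) =
\begin{cases}
\theta_p^{p_0} - 2\pi\,\mu_{p_0}([-\pi, -\pi + \theta_p^{p_0}[) - m(\mu_{p_0}), & \text{if } 0 \le \theta_p^{p_0} < \pi, \\
\theta_p^{p_0} + 2\pi\,\mu_{p_0}([\pi + \theta_p^{p_0}, \pi[) - m(\mu_{p_0}), & \text{if } -\pi \le \theta_p^{p_0} < 0.
\end{cases} \qquad (3.1)$$

2. The function $\frac{d}{d\theta} F_{\mu_{p_0}}$ is left continuous on $[-\pi,\pi[$. We then extend the definition of the derivative
of $F_{\mu_{p_0}}$ by setting for all $\theta \in ]-\pi,\pi[$

$$\frac{d}{d\theta} F_{\mu_{p_0}}(\theta) = \lim_{\varepsilon \to 0^-} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta + \varepsilon), \qquad (3.2)$$

and $\frac{d}{d\theta} F_{\mu_{p_0}}(-\pi) = \lim_{\varepsilon \to 0^-} \frac{d}{d\theta} F_{\mu_{p_0}}(\pi + \varepsilon)$.

3. Let $p \in S^1$ be a point with a cut locus of positive measure, i.e. $\mu(\{\tilde p\}) > 0$. Then, $p$ is a cusp
point of $F_{\mu_{p_0}}$ in the sense that $\lim_{\varepsilon \to 0^-} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p^{p_0} + \varepsilon) - \lim_{\varepsilon \to 0^+} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p^{p_0} + \varepsilon) = 2\pi\,\mu(\{\tilde p\})$.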

Note that the left-continuity comes from our convention on the exponential map, which is defined on
$[-\pi,\pi[$. If a measure $\mu$ is such that $\mu(\{p\}) = 0$ for all $p \in S^1$ then $F_\mu$ is of class $C^1$ on $[-\pi,\pi[$.

Differentiability issues appear when the measure $\mu$ has atoms, see Figure 1.

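Proof. For convenience we omit in this proof the superscript $p_0$ by writing $\theta_p = \theta_p^{p_0}$ for all $p \in S^1$. In
the coordinate system centered at $p_0$ we have for all $\theta_p \in [-\pi,\pi[$

$$F_{\mu_{p_0}}(\theta_p) = \frac12 \int_{\mathbb{R}} t^2\,d\mu_{p_0}(t) - \theta_p\, m(\mu_{p_0}) + \frac12 \theta_p^2 + 2\pi\left( g^+_{\mu_{p_0}}(\theta_p)\,1_{[0,\pi[}(\theta_p) + g^-_{\mu_{p_0}}(\theta_p)\,1_{[-\pi,0[}(\theta_p) \right), \qquad (3.3)$$

where $g^+_{\mu_{p_0}}(\theta) = \int_{-\pi}^{-\pi+\theta} (\pi + t - \theta)\,d\mu_{p_0}(t)$ and $g^-_{\mu_{p_0}}(\theta) = \int_{\pi+\theta}^{\pi} (\pi - t + \theta)\,d\mu_{p_0}(t)$. Hence, to prove
Proposition 3.1, we just have to study the derivatives of $g^+_{\mu_{p_0}}$ and $g^-_{\mu_{p_0}}$.

For all $\theta_p \in ]0,\pi[$ and $\varepsilon \in \mathbb{R}$ such that $\theta_p + \varepsilon \in ]0,\pi[$ we have

$$\frac{1}{\varepsilon}\left( g^+_{\mu_{p_0}}(\theta_p + \varepsilon) - g^+_{\mu_{p_0}}(\theta_p) \right) = \frac{1}{\varepsilon} \int_{-\pi+\theta_p}^{-\pi+\theta_p+\varepsilon} (\pi + t - \theta_p)\,d\mu_{p_0}(t) - \int_{-\pi}^{-\pi+\theta_p+\varepsilon} d\mu_{p_0}(t). \qquad (3.4)$$

The limit from the left in equation (3.4) is $\lim_{\varepsilon \to 0^-} \frac{1}{\varepsilon}( g^+_{\mu_{p_0}}(\theta_p + \varepsilon) - g^+_{\mu_{p_0}}(\theta_p)) = -\mu_{p_0}([-\pi, -\pi + \theta_p[)$
when $0 < \theta_p < \pi$. The (left) derivative at $\theta_p = 0$ is given by $\lim_{\varepsilon \to 0^-} \frac{1}{\varepsilon}( g^-_{\mu_{p_0}}(\theta_p + \varepsilon) - g^+_{\mu_{p_0}}(\theta_p)) = 0$.
Similarly, if $-\pi < \theta_p < 0$, we have $\lim_{\varepsilon \to 0^-} \frac{1}{\varepsilon}( g^-_{\mu_{p_0}}(\theta_p + \varepsilon) - g^-_{\mu_{p_0}}(\theta_p)) = \mu_{p_0}([\pi + \theta_p, \pi[)$ and statement
2 is proved.

To prove statement 1, suppose that the cut locus of $p$ is of $\mu$-measure 0. In this case, the limits from
the left and from the right in equation (3.4) are equal, as $\lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_{-\pi+\theta_p}^{-\pi+\theta_p+\varepsilon} (\pi + t - \theta_p)\,d\mu_{p_0}(t) = 0$
since $\mu_{p_0}(\{\theta_p - \pi\}) = 0$. Hence, formula (3.3) yields $\frac{d}{d\theta} g^+_{\mu_{p_0}}(\theta_p) = -\mu_{p_0}([-\pi, -\pi + \theta_p[)$ if $\theta_p \in
[0,\pi[$ and $\frac{d}{d\theta} g^-_{\mu_{p_0}}(\theta_p) = \mu_{p_0}([\pi + \theta_p, \pi[)$ if $\theta_p \in [-\pi,0[$.

Finally, suppose that $p \in S^1$ has a cut locus of positive measure. If $0 \le \theta_p < \pi$, it means that
$\mu_{p_0}(\{\theta_p - \pi\}) > 0$ and then $\frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p) - \lim_{\varepsilon \to 0^+} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p + \varepsilon) = 2\pi \lim_{\varepsilon \to 0^+} \mu_{p_0}([-\pi + \theta_p, -\pi + \theta_p + \varepsilon[)
= 2\pi\,\mu_{p_0}(\{-\pi + \theta_p\})$. The case $-\pi \le \theta_p < 0$ is similar and the proof of statement 3 is completed. □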
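Formula (3.1) can be sanity-checked against a finite-difference derivative for a purely atomic measure, away from the cusp points described in statement 3. The following sketch is illustrative only: the three atoms and their weights are hypothetical, and the function names `frechet` and `derivative_31` are ours.

```python
import numpy as np

def wrap(theta):
    """Representative of theta modulo 2*pi in [-pi, pi)."""
    return (np.asarray(theta, float) + np.pi) % (2.0 * np.pi) - np.pi

def frechet(theta, atoms, weights):
    """F(theta) = 1/2 * sum_i w_i d(t_i, theta)^2 in a fixed chart."""
    return 0.5 * float(np.dot(weights, wrap(atoms - theta) ** 2))

def derivative_31(theta, atoms, weights):
    """Right-hand side of (3.1), with the half-open intervals of the text."""
    mean = float(np.dot(weights, atoms))
    if theta >= 0:
        mass = weights[(atoms >= -np.pi) & (atoms < -np.pi + theta)].sum()
        return theta - 2.0 * np.pi * mass - mean
    mass = weights[(atoms >= np.pi + theta) & (atoms < np.pi)].sum()
    return theta + 2.0 * np.pi * mass - mean

atoms = np.array([-2.5, 0.4, 2.0])       # hypothetical coordinates in [-pi, pi)
weights = np.array([1.0, 1.0, 1.0]) / 3.0
h = 1e-6
checks = []
for theta in (1.0, -1.0, 2.5):           # none of these is the cut locus of an atom
    numeric = (frechet(theta + h, atoms, weights)
               - frechet(theta - h, atoms, weights)) / (2 * h)
    checks.append(abs(numeric - derivative_31(theta, atoms, weights)))
print(max(checks))
```

The test points are chosen away from the values $t_i \pm \pi$, where the derivative jumps and a centred difference would average the two one-sided limits.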
Figure 1: Let $\mu$ be a combination of three Dirac masses located at $p_1$, $p^*$ and $p_2$, where $p_1$ and $p_2$ are rotations of $p^*$. In blue: $F_{\mu_{p^*}}$. In green:
$\frac{d}{d\theta} F_{\mu_{p^*}}$.



3.2 Local minimum of the Frechet functional 

The critical points of the Fréchet functional are the points at which the derivative of $F_\mu$, in the sense
of Proposition 3.1, is 0. As an immediate consequence of equation (3.1) we have

$$\frac{d}{d\theta} F_{\mu_p}(0) = -m(\mu_p). \qquad (3.5)$$

Thus, the critical points are precisely the exponential barycenters (i.e. points $p \in S^1$ at which $m(\mu_p) =
0$). This fact was already shown in [5] or [12] for general manifolds. Note that a critical point of the
Fréchet functional is not in general an extremum (see the illustration in Figure 1), but it is worth noticing
that the minima of the Fréchet functional are regular points in the sense of the following result.

Corollary 3.1. Let $\mu$ be a probability measure on $S^1$. The cut locus of a (local or global) minimum
of $F_\mu$ is of $\mu$-measure 0.

Proof. Let $p \in S^1$ be a point satisfying $\mu(\{-p\}) > 0$. For any $p_0 \in S^1$, Statement 3 of Proposition
3.1 ensures that the derivative $\frac{d}{d\theta} F_{\mu_{p_0}}$ of the Fréchet functional has a negative jump at $\theta_p^{p_0}$. Hence, the
signs of $(\lim_{\varepsilon \to 0^-} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p^{p_0} + \varepsilon), \lim_{\varepsilon \to 0^+} \frac{d}{d\theta} F_{\mu_{p_0}}(\theta_p^{p_0} + \varepsilon))$ are either $(+,+)$, $(+,-)$ or $(-,-)$. This
means that $\theta_p^{p_0}$ cannot be a minimum of $F_{\mu_{p_0}}$ since it would correspond to the case $(-,+)$. □

Remark 3.1. Note that the assumptions of Corollary 1 in [17] and Theorem 1 in [2] contain a condition
of the form $\mu(\{\tilde p\}) = 0$ to ensure (classical) differentiability of the Fréchet functional at its minimum. In
the case of the circle, Corollary 3.1 shows that the Fréchet functional is automatically differentiable
at its minima.
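For an empirical measure, the characterization of critical points as exponential barycenters reduces to checking that the sample, read in the chart centred at the candidate point, has Euclidean mean zero. A small sketch follows; the sample values and the helper name `chart_mean` are illustrative assumptions.

```python
import numpy as np

def wrap(theta):
    """Representative of theta modulo 2*pi in [-pi, pi)."""
    return (np.asarray(theta, float) + np.pi) % (2.0 * np.pi) - np.pi

def chart_mean(angles, base):
    """m(mu_p): Euclidean mean of the sample read in the chart centred at `base`."""
    return float(np.mean(wrap(np.asarray(angles, float) - base)))

angles = np.array([0.1, -0.1, 0.3])      # hypothetical sample, given as angles
m_at_barycenter = chart_mean(angles, 0.1)
m_elsewhere = chart_mean(angles, 0.0)
print(m_at_barycenter, m_elsewhere)
```

By (3.5), the second value is minus the derivative of the Fréchet functional at the chart centre, so a nonzero chart mean rules the centre out as a critical point.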

4 Necessary and sufficient condition for the existence of the Fréchet
mean

4.1 Main result 

Theorem 4.1. Let $\mu$ be a probability measure and $p^* \in S^1$ be a critical point of $F_\mu$. Then, the following
propositions are equivalent:

1. $p^*$ is a well defined Fréchet mean of $(S^1, \mu)$.

2. For all $0 < \theta < \pi$,

$$\int_0^\theta \lambda([-\pi, -\pi + t[) - \mu_{p^*}([-\pi, -\pi + t[)\,dt > 0,$$

and for all $-\pi < \theta < 0$,

$$\int_\theta^0 \lambda(]\pi + t, \pi[) - \mu_{p^*}(]\pi + t, \pi[)\,dt > 0,$$

where $\lambda$ is the uniform measure on $[-\pi,\pi[$ and $\mu_{p^*}$ is defined in (2.1).

Theorem 4.1 gives a necessary and sufficient condition for the existence of the Fréchet mean of
a general measure $\mu$ on the circle $S^1$. This condition is given in terms of a comparison between the
$\mu$-measure and the uniform measure $\lambda$ of balls centered at the cut locus of a global minimum. The
important point is that the $\mu$-measure of a (small) neighborhood of the cut locus of $p^*$ cannot be
larger than the uniform measure of this neighborhood.

As $\mu$ is a probability measure, the functions $t \mapsto \lambda([-\pi, -\pi + t[) - \mu_{p^*}([-\pi, -\pi + t[)$ and $t \mapsto
\lambda(]\pi - t, \pi[) - \mu_{p^*}(]\pi - t, \pi[)$ do not need to be positive for every $t \in [0,\pi[$.
An example where these quantities are always positive is when $\mu$ admits a density which is a decreasing
function of the distance to a point $p^*$, see [9]. In this case, the density is radially distributed around its
mode $p^*$ which is, by Theorem 4.1, the Fréchet mean of $\mu$. Many classical probability distributions used
in circular data analysis follow this framework: von Mises distribution, wrapped normal distribution,
geodesic normal distribution [17], etc.
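Statement 2 of Theorem 4.1 can be explored numerically for a measure with a density. The sketch below is a rough grid approximation under stated assumptions: the density is given in the chart centred at a critical point (here both examples are symmetric about 0, so their chart mean vanishes), integrals are replaced by trapezoidal sums, and the function name `theorem41_margin` is ours. It returns the smallest value of the two integrals over the grid, so a positive value is consistent with Statement 2 and a negative value contradicts it.

```python
import numpy as np

def theorem41_margin(density, n_grid=20001):
    """Smallest value of the two integrals of Statement 2 over a grid of theta != 0."""
    t = np.linspace(-np.pi, np.pi, n_grid)        # n_grid odd, so t[mid] is 0
    mid = n_grid // 2
    h = t[1] - t[0]
    f = density(t)
    # cdf[i] approximates mu([-pi, t[i][), normalised to total mass 1
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * h)))
    cdf /= cdf[-1]
    s = t[mid:]                                   # s in [0, pi]
    g_pos = s / (2.0 * np.pi) - cdf[: mid + 1]    # lambda - mu on [-pi, -pi + s[
    u = t[: mid + 1]                              # u in [-pi, 0]
    g_neg = -u / (2.0 * np.pi) - (1.0 - cdf[mid:])  # lambda - mu on ]pi + u, pi[
    trap = lambda g: 0.5 * (g[1:] + g[:-1]) * h
    int_pos = np.cumsum(trap(g_pos))              # integral over [0, theta], theta > 0
    int_neg = np.cumsum(trap(g_neg)[::-1])        # integral over [theta, 0], theta < 0
    return float(min(int_pos.min(), int_neg.min()))

# Unnormalised von Mises shape with mode at 0, and the same shape with mode at the cut locus.
margin_mode_at_0 = theorem41_margin(lambda t: np.exp(2.0 * np.cos(t)))
margin_mode_at_cut = theorem41_margin(lambda t: np.exp(-2.0 * np.cos(t)))
print(margin_mode_at_0, margin_mode_at_cut)
```

The second density puts more mass near the cut locus of 0 than the uniform measure does, which is exactly the situation excluded by the theorem.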

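Another well-known example of distributions that admit a well defined Fréchet mean is given by
distributions with support included in a hemisphere, see [3]. More precisely, suppose that there exists a point
$\bar p \in S^1$ with $\mu(\{p \in S^1, d_{S^1}(p,\bar p) \le \frac{\pi}{2}\}) = 1$ and $\mu(\{p \in S^1, d_{S^1}(p,\bar p) < \frac{\pi}{2}\}) > 0$. In this case, Statement
2 of Theorem 4.1 holds since one can show that the minimum $p^*$ of $F_\mu$ is in $\{p \in S^1, d_{S^1}(p,\bar p) < \frac{\pi}{2}\}$ and
that $F_{\mu_{p^*}}(\theta) - F_{\mu_{p^*}}(0) \ge \frac12(\theta + \pi - 2\theta_{\bar p})^2$ for $\theta \in [-\pi, \theta_{\bar p} - \frac{\pi}{2}[$, $F_{\mu_{p^*}}(\theta) - F_{\mu_{p^*}}(0) = \frac{\theta^2}{2}$ for $\theta \in [\theta_{\bar p} - \frac{\pi}{2}, \theta_{\bar p} + \frac{\pi}{2}[$
and $F_{\mu_{p^*}}(\theta) - F_{\mu_{p^*}}(0) \ge \frac12(\theta - \pi - 2\theta_{\bar p})^2$ for all $\theta \in [\theta_{\bar p} + \frac{\pi}{2}, \pi[$. The case of equality corresponds to
distributions with support in the boundary of the hemisphere, that is $\mu_{p^*} = (1 - \varepsilon)\delta_{\theta_{\bar p} - \frac{\pi}{2}} + \varepsilon\,\delta_{\theta_{\bar p} + \frac{\pi}{2}}$
with $\varepsilon = \frac{\theta_{\bar p}}{\pi} + \frac12$, and in this case there are two global argmins at $0$ and $2\theta_{\bar p} \pm \pi$, see Figure 2.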




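Figure 2: Let $\mu_\varepsilon = (1 - \varepsilon)\delta_{\theta - \frac{\pi}{2}} + \varepsilon\,\delta_{\theta + \frac{\pi}{2}}$ with $\varepsilon = \frac{\theta}{\pi} + \frac12$. (a) $F_{\mu_\varepsilon}$ with $\theta = 0$. (b) $F_{\mu_\varepsilon}$ for a second value of $\theta$.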

4.2 Proof of Theorem 4.1 

As already noticed, there exists at least one global argmin $p^* \in S^1$ of $F_\mu$ since $F_\mu$ is a continuous
function defined on the compact set $S^1$. Moreover, Proposition 3.1 and Corollary 3.1 ensure that $p^*$
is a regular critical point of $F_\mu$, i.e. a zero of the derivative with a cut locus of $\mu$-measure 0.

In the normal coordinate system centered at $p^*$ the functional $G_{\mu_{p^*}}(\theta) = F_{\mu_{p^*}}(\theta) - F_{\mu_{p^*}}(0)$ has a
particularly simple expression thanks to equation (3.5). Indeed, we have

$$G_{\mu_{p^*}}(\theta) = \frac12\theta^2 + 2\pi\,1_{[0,\pi[}(\theta) \int_{-\pi}^{-\pi+\theta} (\pi + t - \theta)\,d\mu_{p^*}(t) + 2\pi\,1_{[-\pi,0[}(\theta) \int_{\pi+\theta}^{\pi} (\pi - t + \theta)\,d\mu_{p^*}(t).$$

It is clear that Statement 1 is equivalent to the fact that the unique zero of $G_{\mu_{p^*}}$ is 0. Hence,
Theorem 4.1 will be an easy consequence of Lemma 4.1 below since we have $\lambda([-\pi, -\pi + t[) = \frac{t}{2\pi}$ for
all $0 \le t < \pi$. □

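Lemma 4.1. Let $\mu$ be a probability measure on $S^1$ and $p^* \in S^1$ be an argmin of $F_\mu$. Then for any
$\theta \in [-\pi,\pi[$,

$$G_{\mu_{p^*}}(\theta) = 2\pi
\begin{cases}
\displaystyle\int_0^\theta \frac{t}{2\pi} - \mu_{p^*}([-\pi, -\pi + t[)\,dt, & \text{if } 0 \le \theta < \pi, \\
\displaystyle\int_\theta^0 \frac{-t}{2\pi} - \mu_{p^*}([\pi + t, \pi[)\,dt, & \text{if } -\pi \le \theta < 0.
\end{cases}$$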

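Proof. The probability measure $\mu$ can be decomposed as follows,

$$\mu = a\,\mu_d + (1 - a)\,\mu_\delta, \quad 0 \le a \le 1, \qquad (4.1)$$

where $\mu_d$ is a probability measure such that $\mu_d(\{p\}) = 0$ for all $p \in S^1$ and $\mu_\delta = \sum_{j \ge 1} \omega_j \delta_{p_j}$ where
$\sum_{j \ge 1} \omega_j = 1$ and the $p_j$'s are in $S^1$. Hence, we consider the two cases separately: first, when the
measure is non-atomic, and then, when it is purely atomic. The general case follows immediately in
view of equation (4.1).

First, assume that $\mu$ is an atomless measure on $S^1$. Proposition 3.1 ensures that $F_{\mu_{p^*}}$ is continuously
differentiable everywhere and the real function $G_{\mu_{p^*}}$ is of class $C^1$ on $[-\pi,\pi[$. Formula (3.1) and the
fundamental theorem of calculus give for all $\theta \in [-\pi,\pi[$

$$G_{\mu_{p^*}}(\theta) =
\begin{cases}
\displaystyle\int_0^\theta t - 2\pi\,\mu_{p^*}([-\pi, -\pi + t[)\,dt, & \text{if } 0 \le \theta < \pi, \\
\displaystyle\int_\theta^0 -t - 2\pi\,\mu_{p^*}([\pi + t, \pi[)\,dt, & \text{if } -\pi \le \theta < 0.
\end{cases} \qquad (4.2)$$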

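Consider now the case where $\mu$ is a purely atomic measure. First, we treat the case where the number
of Dirac masses in the sum is finite, i.e. $\mu = \sum_{j=1}^n \omega_j \delta_{p_j}$, $n \in \mathbb{N}$. Recall that $F_{\mu_{p^*}}$ is a Lipschitz
function on $[-\pi,\pi[$. Proposition 3.1 ensures that the derivative is piecewise continuous and formula
(3.1) holds for all $\theta \in [-\pi,\pi[ \setminus \{\theta_{\tilde p_j}\}_{j=1}^n$, i.e. at points that have a cut locus of $\mu$-measure 0. Hence for all
$\theta \in [-\pi,\pi[$, equation (4.2) holds too.

To treat the case where $\mu = \sum_{j \ge 1} \omega_j \delta_{p_j}$ we proceed by approximation. Let $\phi(n) = \{ j \in
\mathbb{N} \mid \omega_j > \frac{1}{n} \}$ and remark that $\mathrm{Card}(\phi(n)) < +\infty$ for all $n \in \mathbb{N}$ since $\sum_{j \ge 1} \omega_j = 1$. Then
if $\nu^n_{p^*} = \frac{1}{c(n)} \sum_{j \in \phi(n)} \omega_j \delta_{\theta_{p_j}}$, where $c(n) = \sum_{j \in \phi(n)} \omega_j$ is a normalizing constant, we have for all
$\theta \in [-\pi,\pi[$,

$$G_{\nu^n_{p^*}}(\theta) =
\begin{cases}
\displaystyle\int_0^\theta t - 2\pi\,\nu^n_{p^*}([-\pi, -\pi + t[)\,dt, & \text{if } 0 \le \theta < \pi, \\
\displaystyle\int_\theta^0 -t - 2\pi\,\nu^n_{p^*}([\pi + t, \pi[)\,dt, & \text{if } -\pi \le \theta < 0.
\end{cases}$$

The sequence $(\nu^n_{p^*})_{n \ge 1}$ converges to $\mu_{p^*}$ in total variation. By the dominated convergence theorem, for
all $\theta \in [-\pi,\pi[$, $G_{\nu^n_{p^*}}(\theta)$ converges as $n \to \infty$ to (4.2). □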

4.3 The criterion $P(\alpha,\varphi)$

Although the necessary and sufficient condition of Theorem 4.1 is a key step to understand the problem
of non-uniqueness of the Fréchet mean on $S^1$, it is of little practical interest: we have to know a priori
a critical point $p^*$ of the Fréchet functional. In this section, we derive sufficient conditions of existence
that impose no restriction on the support of the probability measure and that are easy to use. Let us
introduce the following definition:

Definition 4.1. Let $f : S^1 \to \mathbb{R}^+$ be a probability density, $p \in S^1$, $\alpha \in ]0,1]$ and $\varphi \in ]0,\pi[$. We say
that $f$ satisfies the property $P(p,\alpha,\varphi)$ if for all $|\theta| \ge \varphi$

$$f_p(\theta) \le \frac{1 - \alpha}{2\pi}, \qquad (4.3)$$

where $f_p = f \circ e_p : [-\pi,\pi[ \to \mathbb{R}^+$. Moreover, we say that $f \in P(\alpha,\varphi)$ if there is a $p \in S^1$ such that $f$
satisfies $P(p,\alpha,\varphi)$.

Figure 3: An example of distribution satisfying $P(\alpha,\varphi)$. Plot of the density $f_p$ in blue with the bounds
$P(p, 0.1, 2)$ in yellow and $P(p, 0.5, 1.6)$ in red.
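As an illustration of Definition 4.1, the property $P(p,\alpha,\varphi)$ can be tested on a grid for a density given in the chart centred at $p$. This sketch assumes that the bound in (4.3) is $(1-\alpha)/(2\pi)$, which is the value consistent with the proof of Proposition 4.1; the function names, the grid size and the von Mises example with $\kappa = 4$ are our own choices. A grid check is only indicative, since the density is not inspected between grid points.

```python
import numpy as np

def satisfies_P(density_in_chart, alpha, phi, n_grid=10001):
    """Grid check of (4.3): f_p(theta) <= (1 - alpha) / (2*pi) whenever |theta| >= phi."""
    theta = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    outside = np.abs(theta) >= phi
    bound = (1.0 - alpha) / (2.0 * np.pi)
    return bool(np.all(density_in_chart(theta[outside]) <= bound))

def von_mises(theta, kappa=4.0):
    """Von Mises density with mode 0; np.i0 is the modified Bessel function I_0."""
    return np.exp(kappa * np.cos(theta)) / (2.0 * np.pi * np.i0(kappa))

concentrated_enough = satisfies_P(von_mises, alpha=0.5, phi=1.3)
ball_too_small = satisfies_P(von_mises, alpha=0.5, phi=0.5)
print(concentrated_enough, ball_too_small)
```

With $\alpha = 0.5$ the bound is about $0.08$: the density has dropped below it at distance $1.3$ from the mode but not at distance $0.5$.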

The parameters $\alpha$ and $\varphi$ control the concentration of $\mu$ around $p$. The idea is to control the mass
lying in the complement of the ball $B(p,\varphi)$, see Figure 3. We have the following properties:

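Lemma 4.2. Let $f : S^1 \to \mathbb{R}^+$ be a probability density on the circle. Then

1. $P(p,\alpha_1,\varphi_1) \Rightarrow P(p,\alpha_2,\varphi_2)$ if $\alpha_1 \ge \alpha_2$ and $\varphi_1 \le \varphi_2$.

2. Let $p_1,p_2 \in S^1$ and $\varphi < \frac{\pi}{2}$. If $d_{S^1}(p_1,p_2) < \pi - \varphi$ then $P(p_1,\alpha,\varphi) \Rightarrow P(p_2,\alpha,\varphi + d_{S^1}(p_1,p_2))$.

3. If $f$ satisfies $P(p,\alpha,\varphi)$ then $|m(\mu_p)| \le \varphi + \frac{1-\alpha}{4\pi}(\pi - \varphi)^2$.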

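Proof. The first proposition is obvious in view of Definition 4.1.

To prove the second claim suppose that $0 < \theta_{p_2}^{p_1} = -\theta_{p_1}^{p_2} < \pi - \varphi$ (the other case is similar) and
write

$$f_{p_2}(\theta) = f_{p_1}(\theta + \theta_{p_2}^{p_1})\,1_{[-\pi,\, \pi - \theta_{p_2}^{p_1}[}(\theta) + f_{p_1}(\theta + \theta_{p_2}^{p_1} - 2\pi)\,1_{[\pi - \theta_{p_2}^{p_1},\, \pi[}(\theta).$$

In particular, it implies that $\pi > \pi - \theta_{p_2}^{p_1} > \varphi$ and since $P(p_1,\alpha,\varphi)$ holds, we have $f_{p_2}(\theta) \le \frac{1-\alpha}{2\pi}$ if
$\theta > \varphi - \theta_{p_2}^{p_1}$ or $\theta < -\varphi - \theta_{p_2}^{p_1}$. This is equivalent to the fact that $P(p_2, \alpha, \max\{|-\varphi - \theta_{p_2}^{p_1}|, |\varphi - \theta_{p_2}^{p_1}|\}) =
P(p_2, \alpha, \varphi + \theta_{p_2}^{p_1})$ holds. The case $\varphi - \pi < \theta_{p_2}^{p_1} < 0$ is similar and we have $P(p_2, \alpha, \varphi - \theta_{p_2}^{p_1})$. Finally
recall that $|\theta_{p_2}^{p_1}| = d_{S^1}(p_1,p_2)$ and the property is proved.

To show the last claim, we only need to consider the case where $\mu_p$ has its support in $[0,\pi[$.
Indeed, $\mu_p = \omega\mu_p^- + (1 - \omega)\mu_p^+$ where $\mu_p^-([-\pi,0[) = \mu_p^+([0,\pi[) = 1$ and $0 \le \omega \le 1$. It yields that
$m(\mu_p) = \omega \int_{\mathbb{R}^-} t\,d\mu_p^-(t) + (1 - \omega) \int_{\mathbb{R}^+} t\,d\mu_p^+(t) \le \int_{\mathbb{R}^+} t\,d\mu_p^+(t)$. Then, if the density $f_p$ of
$\mu_p$ has its support in $[0,\pi[$ we have

$$m(\mu_p) \le \varphi\left(1 - \int_\varphi^\pi f_p(t)\,dt\right) + \int_\varphi^\pi t f_p(t)\,dt \le \varphi + \frac{1-\alpha}{2\pi} \int_\varphi^\pi (t - \varphi)\,dt,$$

which gives the result. □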

If the density $f$ is sufficiently concentrated around a critical point $p^*$ of $F_\mu$ then this point is the
Fréchet mean of $\mu$. More precisely we have the following result.

Proposition 4.1. Let $\mu$ be a probability measure with density $f : S^1 \to \mathbb{R}^+$ and $p^* \in S^1$ be a critical
point of $F_\mu$. If $f$ satisfies $P(p^*,\alpha,\varphi)$ with $\alpha \in ]0,1]$ and $0 < \varphi < \varphi_\alpha = \pi\frac{\sqrt{\alpha}}{1+\sqrt{\alpha}}$ then $\mu$ admits a well
defined Fréchet mean at $p^*$.

For all $\alpha \in ]0,1]$, we have $\varphi_0 = 0 < \varphi_\alpha \le \frac{\pi}{2} = \varphi_1$. Note that if $\alpha = 1$ then $\mu$ has its support
included in the ball $B(p^*, \frac{\pi}{2}) = \{p \in S^1, d_{S^1}(p^*,p) \le \frac{\pi}{2}\}$ and when $\alpha < 1$, the support of $\mu$ can be the
entire circle $S^1$.

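Proof. Let $G_{\mu_{p^*}}(\theta) = F_{\mu_{p^*}}(\theta) - F_{\mu_{p^*}}(0)$ for all $\theta \in [-\pi,\pi[$. As the measure $\mu$ admits a density $f$, $G_{\mu_{p^*}}$
is twice differentiable and equation (3.1) implies that $\frac{d^2}{d\theta^2} G_{\mu_{p^*}}(\theta) = 1 - 2\pi f_{p^*}(-\pi + \theta)$ if $0 \le \theta < \pi$,
and $\frac{d^2}{d\theta^2} G_{\mu_{p^*}}(\theta) = 1 - 2\pi f_{p^*}(\pi + \theta)$ if $-\pi \le \theta < 0$. Since $f \in P(p^*,\alpha,\varphi)$, the function $G_{\mu_{p^*}}$ is convex
on $[-\pi + \varphi, \pi - \varphi]$ and has a unique minimum at 0. Let us show that 0 is the only argmin of $F_{\mu_{p^*}}$ on
$[-\pi,\pi[$. If $\theta \in [\pi - \varphi, \pi[$, we have thanks to Lemma 4.1

$$G_{\mu_{p^*}}(\theta) = G_{\mu_{p^*}}(\pi - \varphi) + \int_{\pi - \varphi}^{\theta} t - 2\pi\,\mu_{p^*}([-\pi, -\pi + t[)\,dt.$$

Since $f \in P(p^*,\alpha,\varphi)$ we have $G_{\mu_{p^*}}(\pi - \varphi) \ge \frac{\alpha}{2}(\pi - \varphi)^2$ and the second term is bounded from below
by $\int_{\pi - \varphi}^{\theta} t - 2\pi\,\nu([-\pi, -\pi + t[)\,dt$ where $\nu = \frac12(\delta_{-\varphi} + \delta_{\varphi})$. It yields for $\theta \in [\pi - \varphi, \pi[$,

$$G_{\mu_{p^*}}(\theta) \ge \frac12\left(\alpha(\pi - \varphi)^2 - \varphi^2\right). \qquad (4.4)$$

The right hand side of (4.4) is strictly positive if $\varphi < \varphi_\alpha = \pi\frac{\sqrt{\alpha}}{1+\sqrt{\alpha}}$. Similarly, the same condition
implies $G_{\mu_{p^*}}(\theta) > 0$ for $\theta \in [-\pi, -\pi + \varphi[$. □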
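For completeness, the threshold $\varphi_\alpha$ follows from the positivity of the right hand side of (4.4) by an elementary computation, written out here as a short worked step (both sides of the second inequality are positive for $0 < \varphi < \pi$, so taking square roots is legitimate):

```latex
\alpha(\pi-\varphi)^2 - \varphi^2 > 0
\iff \sqrt{\alpha}\,(\pi-\varphi) > \varphi
\iff \varphi < \frac{\pi\sqrt{\alpha}}{1+\sqrt{\alpha}} = \varphi_\alpha ,
\qquad 0<\varphi<\pi,\quad 0<\alpha\le 1 .
```

In particular $\varphi_1 = \pi/2$ and $\varphi_\alpha \to 0$ as $\alpha \to 0$.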

We are now able to define a functional class of densities that admit a well defined Fréchet mean
without restriction on the support of the measure.

Theorem 4.2. Let $0 < \delta < \frac12$ be a parameter of concentration and $\mu$ a probability measure with
density $f \in P(\alpha,\varphi)$ (see Definition 4.1) with

$$\alpha_\delta \le \alpha \le 1 \quad \text{and} \quad \varphi \le \delta\varphi_\alpha,$$

where $\alpha_\delta$ is the square of the root of $(5 - 6\delta + \delta^2)X^3 + (1 - \delta^2)X^2 - (2\delta + 1)X - 1$ that lies in $]0,1]$
and $\varphi_\alpha = \pi\frac{\sqrt{\alpha}}{1+\sqrt{\alpha}}$. Then $\mu$ admits a well defined Fréchet mean.



First, remark that there is no need to know a critical point a priori. Second, the parameter $\delta$
controls the concentration of $f$ via the inequalities $\alpha_\delta \le \alpha$ and $\varphi \le \delta\varphi_\alpha$. There is a tradeoff between
$\alpha$ and the possible value of $\varphi$: the smaller $\alpha$ is (i.e. the less $f$ is concentrated), the smaller $\varphi$ must be
(i.e. we need to control the value of the density on a bigger interval). In Table 1 we give examples
of numerical values. Note that the column corresponding to $\delta = 0$ is given as a reference only, as the
set $P(\alpha_\delta, \delta\varphi_\alpha)$ is empty for this value of $\delta$.



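$\delta =$                              0       1/10    1/5     1/3     1/2
$\alpha_\delta \le$                     0.39    0.46    0.54    0.69    1
$\delta\varphi_{\alpha_\delta} \ge$     0       0.12    0.26    0.47    $\pi/4$

Table 1: Some values of $\alpha_\delta$ and $\delta\varphi_{\alpha_\delta}$ depending on $\delta \in ]0, \frac12[$.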
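The entries of Table 1 can be recomputed from the cubic of Theorem 4.2 and from $\varphi_\alpha = \pi\sqrt{\alpha}/(1+\sqrt{\alpha})$. The sketch below uses `numpy.roots` as the root finder; this choice, and the function names `alpha_delta` and `phi_alpha`, are ours. For $0 \le \delta \le \frac12$ the coefficient signs are $(+,+,-,-)$, so by Descartes' rule of signs the cubic has exactly one positive root.

```python
import numpy as np

def alpha_delta(delta):
    """Square of the root in ]0, 1] of the cubic stated in Theorem 4.2."""
    coeffs = [5 - 6 * delta + delta ** 2, 1 - delta ** 2, -(2 * delta + 1), -1.0]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-6].real
    # small tolerance: for delta = 1/2 the root is exactly 1 up to rounding
    admissible = real[(real > 0.0) & (real <= 1.0 + 1e-9)]
    return float(min(admissible[0], 1.0)) ** 2

def phi_alpha(alpha):
    """phi_alpha = pi * sqrt(alpha) / (1 + sqrt(alpha))."""
    return np.pi * np.sqrt(alpha) / (1.0 + np.sqrt(alpha))

for delta in (0.0, 1 / 10, 1 / 5, 1 / 3, 1 / 2):
    a = alpha_delta(delta)
    print(f"delta={delta:.3f}  alpha_delta={a:.3f}  delta*phi={delta * phi_alpha(a):.3f}")
```

Any bracketing root finder on $]0,1]$ would serve equally well; `numpy.roots` is used only to keep the sketch short.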



Proof. We show that under the hypotheses of Theorem 4.2, there is a critical point $p^*$ of $F_\mu$
satisfying $d_{S^1}(p,p^*) \le (1 - \delta)\varphi_\alpha$ where $p \in S^1$ is a point satisfying $f \in P(p,\alpha,\varphi)$. Hence, by Lemma
4.2, $f$ belongs to $P(p^*, \alpha, \delta\varphi_\alpha + (1 - \delta)\varphi_\alpha) = P(p^*, \alpha, \varphi_\alpha)$ and Proposition 4.1 will ensure that $p^*$ is
the Fréchet mean of $\mu$.

In the rest of the proof, we show that there is a $p^* \in S^1$ such that $\frac{d}{d\theta} F_{\mu_p}(\theta_{p^*}) = 0$ with $d_{S^1}(p,p^*) \le
(1 - \delta)\varphi_\alpha$. To this end, suppose that $m(\mu_p) \ge 0$ (the case $m(\mu_p) < 0$ is similar). Equation (3.5)
implies that $\frac{d}{d\theta} F_{\mu_p}(0) = -m(\mu_p) \le 0$, and let us check that, under the hypotheses of Theorem 4.2, we
have $\frac{d}{d\theta} F_{\mu_p}((1 - \delta)\varphi_\alpha) \ge 0$. Since $\frac{d}{d\theta} F_{\mu_p}$ is a continuous function, the intermediate value theorem
will ensure the existence of a critical point $p^*$ such that $|\theta_{p^*}| \le (1 - \delta)\varphi_\alpha$. Equation (3.1) gives

$$\frac{d}{d\theta} F_{\mu_p}((1 - \delta)\varphi_\alpha) = (1 - \delta)\varphi_\alpha - 2\pi\,\mu_p([-\pi, -\pi + (1 - \delta)\varphi_\alpha[) - m(\mu_p).$$

We have $-2\pi\,\mu_p([-\pi, -\pi + (1 - \delta)\varphi_\alpha[) \ge (\alpha - 1)(1 - \delta)\varphi_\alpha$ since $f \in P(p,\alpha,\delta\varphi_\alpha)$ and $|-\pi + (1 - \delta)\varphi_\alpha| \ge
\delta\varphi_\alpha$. Moreover, $-m(\mu_p) \ge -\delta\varphi_\alpha - \frac{1-\alpha}{4\pi}(\pi - \delta\varphi_\alpha)^2$ by Lemma 4.2, Statement 3. It gives

$$\frac{d}{d\theta} F_{\mu_p}((1 - \delta)\varphi_\alpha) \ge \frac{\pi\left[(5 - 6\delta + \delta^2)\alpha^{3/2} + (1 - \delta^2)\alpha - (2\delta + 1)\sqrt{\alpha} - 1\right]}{4(1 + \sqrt{\alpha})}.$$

This quantity is positive as soon as $1 \ge \alpha \ge \alpha_\delta$, where $\sqrt{\alpha_\delta}$ is the root of the polynomial $X \mapsto
(5 - 6\delta + \delta^2)X^3 + (1 - \delta^2)X^2 - (2\delta + 1)X - 1$ that lies in $]0,1]$. It is easy to see that $\alpha_{1/2} = 1$ and
numerical experiments show that the (increasing) function $\delta \mapsto \alpha_\delta$ takes its values in $]0.39, 1[$ for
$\delta \in ]0, \frac12[$. □

5 Fréchet mean of an empirical measure
5.1 Existence 

Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with values in $(S^1, d_{S^1})$
and with probability distribution $\mu$. The empirical measure is defined as usual by

$$\mu^n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i},$$

and we denote by $p^*_n$ the empirical Fréchet mean, defined as the unique argmin of $F_{\mu^n}(p) = \frac{1}{2n} \sum_{i=1}^n d^2_{S^1}(X_i, p)$,
$p \in S^1$. In [18] a strong law of large numbers is given for the empirical Fréchet mean in a semi-metric
space, which covers the case of $(S^1, d_{S^1})$. In particular, if $p^*_n$ exists for each $n \in \mathbb{N}$, the empirical Fréchet
mean is a consistent estimator of the Fréchet mean. Indeed, the empirical Fréchet mean is well defined
almost surely for a wide class of probability measures, as the following fact from [2], Remark 2.6, shows.

Lemma 5.1. Let $\mu$ be a non-atomic probability measure on the circle, i.e. satisfying $\mu(\{p\}) = 0$ for
all $p \in S^1$. Then for all $n \in \mathbb{N}$ the empirical Fréchet mean exists almost surely.

Hence, the empirical Fréchet mean $p^*_n$ of a probability measure $\mu$ can be computed even if $\mu$ does
not possess a well defined Fréchet mean.

5.2 Consistency 

If the Fréchet mean $p^*$ of $\mu$ is well defined, we study the rate of convergence of the empirical Fréchet
mean $p^*_n$ to $p^*$.

Proposition 5.1. Let $\mu$ be a measure with density $f : S^1 \to \mathbb{R}^+$ that admits a well defined Fréchet
mean $p^*$. Then, there exists a strictly increasing function $\rho : ]0,\pi[ \to ]0,+\infty[$ such that for all $p \in S^1$,
$p \ne p^*$,

$$F_\mu(p) - F_\mu(p^*) \ge \rho(d_{S^1}(p,p^*)).$$

If $p^*_n$ denotes the empirical Fréchet mean, we have for all $x > 0$

$$\mathbb{P}\left( \rho(d_{S^1}(p^*_n, p^*)) \ge C(s)\sqrt{\frac{x}{n}} \right) \le 2e^{-x}, \qquad (5.1)$$

where $s = \max\{|x - y| ,\ x,y \in \mathrm{support}(\mu)\}$ and $C(s) = (4\pi^2 + 4\pi^2 s + 2s) \le 4\pi(2\pi^2 + \pi + 1)$.

The function $\rho$ in the statement of Proposition 5.1 determines the rate of convergence of $p^*_n$ to $p^*$.

Proof. The first claim about the lower bound $\rho$ is a direct consequence of the following lemma.

Lemma 5.2. Let $f : [-\pi,\pi[ \to \mathbb{R}^+$ be a continuous function on $[-\pi,\pi[$ (see Section 3). If $f$ vanishes
at a unique point $\theta_0 \in [-\pi,\pi[$, then there exists a strictly increasing function $\rho : ]0,\pi[ \to ]0,+\infty[$ such
that for all $\theta \in [-\pi,\pi[ \setminus \{\theta_0\}$,

$$f(\theta) \ge \rho(d_{S^1}(\theta_0, \theta)).$$

Proof. As $f$ is the restriction to $[-\pi,\pi[$ of a continuous periodic function we can assume, without loss
of generality, that $\theta_0 = 0$. Then, define for all $\theta \in ]0,\pi[$

$$\rho(\theta) = \frac{1}{|\theta|} \int_0^\theta g(t)\,dt, \qquad \text{where } g(t) = \min\left\{ f(-t), f(t), \min_{|r| \ge t} f(r) \right\} \text{ for all } t \in [-\pi,\pi[.$$

□



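We now focus on the proof of the concentration inequality (5.1), which is divided in two steps: first
we show the uniform convergence in probability of $F_{\mu^n_p}$ to $F_{\mu_p}$ and then, we deduce the convergence
of their argmins by using the lower bound given by the function $\rho$. Equation (3.1) implies that

$$2 \sup_{\theta \in [-\pi,\pi[} \left| F_{\mu^n_p}(\theta) - F_{\mu_p}(\theta) \right| \le 2\pi \sup_{\theta \in [-\pi,\pi[} \left| \frac{d}{d\theta} F_{\mu^n_p}(\theta) - \frac{d}{d\theta} F_{\mu_p}(\theta) \right| + 2\left| F_{\mu^n_p}(0) - F_{\mu_p}(0) \right|$$
$$\le 4\pi^2 \left| m(\mu_p) - m(\mu^n_p) \right| + 2\left| m_2(\mu_p) - m_2(\mu^n_p) \right| \qquad (5.2)$$
$$\quad + 4\pi^2 \sup_{\theta \in [-\pi,\pi[} \left| \mu_p([-\pi,\theta[) - \mu^n_p([-\pi,\theta[) \right|, \qquad (5.3)$$

where $m_2(\nu) = \int_{\mathbb{R}} t^2\,d\nu(t)$, for a measure $\nu$ on $\mathbb{R}$. The term (5.3) can be controlled in probability
using the Dvoretzky-Kiefer-Wolfowitz inequality (see e.g. [15]), and we have for all $x > 0$,
$\mathbb{P}\left( 4\pi^2 \sup_{\theta \in [-\pi,\pi[} |\mu_p([-\pi,\theta[) - \mu^n_p([-\pi,\theta[)| \ge 4\pi^2 \sqrt{\frac{x}{n}} \right) \le 2e^{-x}$. For the terms of (5.2), which
involve the first and second moments of $\mu_p$ and $\mu^n_p$, we use a Hoeffding-type inequality which gives
for all $x > 0$, $\mathbb{P}\left( 4\pi^2 |m(\mu_p) - m(\mu^n_p)| + 2|m_2(\mu_p) - m_2(\mu^n_p)| \ge s(4\pi^2 + 2s)\sqrt{\frac{x}{n}} \right) \le 2e^{-x}$, where
$s = \max\{|x - y| ,\ x,y \in \mathrm{support}\,\mu\}$ is the diameter of the support of $\mu$. Combining these two
concentration inequalities we have for all $x > 0$,

$$\mathbb{P}\left( 2 \sup_{\theta \in [-\pi,\pi[} \left| F_{\mu_p}(\theta) - F_{\mu^n_p}(\theta) \right| \ge (4\pi^2 + 4\pi^2 s + 2s)\sqrt{\frac{x}{n}} \right) \le 2e^{-x}. \qquad (5.4)$$

We now use a classical inequality in M-estimation, $|F_{\mu_p}(\theta_{p^*_n}) - F_{\mu_p}(\theta_{p^*})| \le 2 \sup_{\theta \in [-\pi,\pi[} |F_{\mu^n_p}(\theta) -
F_{\mu_p}(\theta)|$. By Lemma 5.2, there exists an increasing function $\rho : \mathbb{R}^+ \to \mathbb{R}^+$ such that $F_{\mu_p}(\theta_{p^*_n}) -
F_{\mu_p}(\theta_{p^*}) \ge \rho(d_{S^1}(\theta_{p^*_n}, \theta_{p^*}))$. Plugging this in equation (5.4) we have,

$$\mathbb{P}\left( \rho(d_{S^1}(\theta_{p^*_n}, \theta_{p^*})) \ge (4\pi^2 + 4\pi^2 s + 2s)\sqrt{\frac{x}{n}} \right) \le 2e^{-x},$$

and the proof of Proposition 5.1 is completed. □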



The function $\rho$ that appears in the statement of Proposition 5.1 can be explicitly computed if the
density $f \in P(\alpha,\varphi)$. The parameter $\alpha \in ]0,1]$ can be interpreted as a measure of the convexity of $F_{\mu_p}$
on the interval $[-\varphi,\varphi]$. For example, if $\alpha = 1$ and $\varphi = \varphi_\alpha = \frac{\pi}{2}$, then $\mu$ has its support contained in
$[-\frac{\pi}{2}, \frac{\pi}{2}]$ and $F_\mu$ is quadratic on $[-\frac{\pi}{2}, \frac{\pi}{2}]$.

Proposition 5.2. Let p be a probability measure with density f : S 1 — > R + satisfying the hypothesis 
of Theorem Note p* the Frechet mean of p and for all x > we have 



dM, P n>VB(c^){-y <2e 



where B(a, (p) = C max j , f j with 7(0, <p) = \{a.{ir - ip) 2 - if 2 ) and C = 47r(27r 2 +tt + 1). 
Proof. This result is a direct consequence of Proposition 5.1 and we only have to find a strictly
increasing function $\rho : [0, \pi] \to \mathbb{R}^+$ satisfying, for all $\theta \in [-\pi, \pi[$, $F_\mu(p) - F_\mu(p^*) \ge \rho(d_{\mathbb{S}^1}(p, p^*))$. As
$\mu_{p^*}$ admits a density $f_{p^*}$, the Fréchet functional $F_{\mu_{p^*}}$ is twice differentiable and $f \in \mathcal{P}(p^*, \alpha, \varphi)$, see
proof of Theorem 4.2. Thus, for all $\theta \in [-\pi + \varphi_\alpha, \pi - \varphi_\alpha]$, a second order Taylor expansion of $F_{\mu_{p^*}}$ at $0$
ensures that for some $\tilde\theta \in [-\pi + \varphi_\alpha, \pi - \varphi_\alpha]$,

F^ (9) - F„ (0) = \e 2 ^- 2 F, p , (9) > V. 
For all $\theta \in [-\pi, -\varphi_\alpha[ \,\cup\, ]\varphi_\alpha, \pi[$ we have, by inequality (4.4),

F^ (9) - F^ (0) > - Va f - vl) = 7 (a, > 0. 

Then, let p{t) = t 2 mm{^^, f } > t 2 min{^^, f } for any <p < <p a . □ 
5.3 Computation 

Computing the Fréchet mean of a general probability measure may not be an easy task, as it is a
global optimization problem. In practice the Fréchet functional is not convex, and a gradient
descent algorithm will only return a local minimum that depends on the initialization point.
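
The dependence on the initialization can be seen on a small example. The following is a minimal sketch, not the paper's method: the sample, the step size and the helper names (`wrap`, `frechet_derivative`, `gradient_descent`) are all hypothetical choices made for illustration.

```python
import math

def wrap(angle):
    """Representative of an angle in [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def frechet_derivative(theta, angles):
    """Derivative of the empirical Frechet functional at theta
    (defined when no data point is antipodal to theta)."""
    return 2.0 * sum(wrap(theta - t) for t in angles) / len(angles)

def gradient_descent(theta0, angles, step=0.25, n_iter=200):
    """Plain fixed-step gradient descent on the circle (illustrative only)."""
    theta = theta0
    for _ in range(n_iter):
        theta = wrap(theta - step * frechet_derivative(theta, angles))
    return theta

# Hypothetical sample: three points near 0 and two points near 3 radians.
sample = [-0.2, 0.0, 0.2, 2.9, 3.0]
print(gradient_descent(1.0, sample))   # converges near 1.18
print(gradient_descent(-1.5, sample))  # converges near -1.333, another local minimum
```

Both limits are critical points of the functional, but only the first one is the global minimizer for this sample, which is why a comparison of all local minima is needed.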




Figure 4: (a) and (b) Plots of $F_{\mu_n}$ where $n = 10$ and $\mu_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$, where the $X_i$ are i.i.d. with uniform law
$\lambda$. The red points are the local minima computed with the described algorithm. (c) In blue: plot of
$F_{\mu_n}$ where $n = 4$. In green: the derivative of $F_{\mu_n}$. The critical points are given by the intersection
between the green curve and the x-axis in black.



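The Fréchet functional of an empirical measure is a continuous piecewise quadratic function, as it
can be written

$$F_{\mu_n}(\theta) = \frac{1}{n} \sum_{i=1}^n \Big(\theta - \theta_i + 2\pi\big(\mathbb{1}_{[\theta+\pi,\pi[}(\theta_i)\,\mathbb{1}_{[-\pi,0[}(\theta) - \mathbb{1}_{[-\pi,\theta-\pi[}(\theta_i)\,\mathbb{1}_{[0,\pi[}(\theta)\big)\Big)^2 .$$

This formula together with Corollary 3.1 implies that the regular critical points (i.e. points with no mass
at their cut locus) of $F_{\mu_n}$ are precisely the local minima of $F_{\mu_n}$. Moreover, the cumulative distribution
function of $\mu_n$ is, here, piecewise constant with jumps of size $\frac{1}{n}$ and we have

$$\mu_n([-\pi, t[) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}_{[-\pi,t[}(\theta_i) = \frac{1}{n} \operatorname{Card}\{\theta_i < t\} .$$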

Hence, the derivative of the empirical Fréchet functional is piecewise linear, and finding the critical
points amounts to solving $n$ affine equations given by equation (3.1); see also Figure 4(c) for an illustration.
Note that in practice there are fewer than $n$ solutions, see Figures 4(a) and 4(b).
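
The piecewise linear structure of the derivative can be checked numerically. The sketch below is an illustration under our own assumptions: the data are already given as coordinates in $[-\pi, \pi[$, the function names are hypothetical, and the closed form is derived directly from the piecewise quadratic expression of the functional (we do not claim it is literally equation (3.1)).

```python
import math

def wrap(angle):
    """Representative of an angle in [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def frechet_functional(theta, angles):
    """Mean squared arclength distance from theta to the data."""
    return sum(wrap(theta - t) ** 2 for t in angles) / len(angles)

def frechet_derivative(theta, angles):
    """Derivative of the functional (no data point antipodal to theta)."""
    return 2.0 * sum(wrap(theta - t) for t in angles) / len(angles)

def derivative_closed_form(theta, angles):
    """Same derivative written with the mean and a count of data points,
    for theta and data coordinates in [-pi, pi)."""
    n = len(angles)
    mean = sum(angles) / n
    if theta >= 0:
        # data points strictly below theta - pi are reached "the other way round"
        k = sum(1 for t in angles if t < theta - math.pi)
        return 2.0 * (theta - mean - 2.0 * math.pi * k / n)
    # data points strictly above theta + pi
    k = sum(1 for t in angles if t > theta + math.pi)
    return 2.0 * (theta - mean + 2.0 * math.pi * k / n)

coords = [-2.0, 0.5, 1.0, 2.5]  # hypothetical coordinates
for theta in (-3.0, -1.0, 0.3, 1.5, 3.0):
    print(theta, frechet_derivative(theta, coords), derivative_closed_form(theta, coords))
```

Between two consecutive jumps of the count `k`, the derivative is affine in `theta` with slope 2, so each piece contributes at most one critical point.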

The following algorithm takes as input the values $\{X_i\}_{i=1}^n$ and returns the Fréchet mean of $\mu_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$.



Initialization Step: Choose an arbitrary point $p \in \mathbb{S}^1$.

Compute the coordinates $\{\theta_i\}_{i=1}^n$ and reorder them in increasing order. We denote by $\tau_0^- = -\pi < \tau_1^- < \tau_2^- < \dots < \tau_{n_1}^- < 0 = \tau_{n_1+1}^-$ the $n_1$ negative sorted terms and by $\tau_0^+ = \pi > \tau_1^+ > \dots > \tau_{n_2}^+ > 0 = \tau_{n_2+1}^+$
the $n_2 = n - n_1$ positive sorted terms.

Compute the mean $m(\mu_p^n) = \frac{1}{n}(\tau_1^- + \dots + \tau_{n_1}^- + \tau_1^+ + \dots + \tau_{n_2}^+)$ and initialize $\theta_p^*$ to $0$, say.

# The first step compares all the local minima in $[0, \pi[$
Step 1: For $i$ from $0$ to $n_1$ do

    # $\theta_{p,new}^*$ is the candidate to be a critical point between $\tau_i^- + \pi$ and $\tau_{i+1}^- + \pi$

    Let $\theta_{p,new}^* = \frac{2\pi i}{n} + m(\mu_p^n)$

    # verify if $\theta_{p,new}^*$ is a critical point and then test its value. If better, keep it.

    if $\tau_i^- + \pi < \theta_{p,new}^* < \tau_{i+1}^- + \pi$ and $F_{\mu_p^n}(\theta_{p,new}^*) < F_{\mu_p^n}(\theta_p^*)$ then $\theta_p^* := \theta_{p,new}^*$ end if
end for.

# Step 2 is the same as Step 1 but for local minima in $[-\pi, 0[$
Step 2: For $i$ from $0$ to $n_2$ do

    Let $\theta_{p,new}^* = -\frac{2\pi i}{n} + m(\mu_p^n)$

    if $\tau_{i+1}^+ - \pi < \theta_{p,new}^* < \tau_i^+ - \pi$ and $F_{\mu_p^n}(\theta_{p,new}^*) < F_{\mu_p^n}(\theta_p^*)$ then $\theta_p^* := \theta_{p,new}^*$ end if
end for.

# The value of $\theta_p^*$ is the best argmin
Output: Return $p^* = e_p(\theta_p^*)$.
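
The steps above can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the authors' implementation: the input is a list of angles in radians (coordinates around a base point with coordinate 0), the names are hypothetical, and the interval tests of Steps 1 and 2 are replaced by a direct comparison of the functional at every candidate $m(\mu_p^n) + 2\pi k / n$. Skipping the tests only adds non-critical points to the comparison, which cannot change the minimum, and it avoids floating-point issues at interval boundaries.

```python
import math

def wrap(angle):
    """Representative of an angle in [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def frechet_functional(theta, angles):
    """Mean squared arclength distance from theta to the data."""
    return sum(wrap(theta - t) ** 2 for t in angles) / len(angles)

def empirical_frechet_mean(angles):
    """Coordinate in [-pi, pi) of a global minimizer of the empirical functional."""
    thetas = [wrap(t) for t in angles]
    n = len(thetas)
    mean = sum(thetas) / n
    n1 = sum(1 for t in thetas if t < 0)   # number of negative coordinates
    n2 = n - n1                            # number of nonnegative coordinates
    best = 0.0                             # initial guess, as in the text
    best_value = frechet_functional(best, thetas)
    # k = 0..n1 gives the Step 1 candidates, k = -n2..0 the Step 2 candidates.
    for k in range(-n2, n1 + 1):
        candidate = wrap(mean + 2.0 * math.pi * k / n)
        value = frechet_functional(candidate, thetas)
        if value < best_value:
            best, best_value = candidate, value
    return best

print(empirical_frechet_mean([0.1, 0.2, 0.3]))  # close to 0.2
print(empirical_frechet_mean([3.0, -3.0]))      # close to -pi, the antipode of 0
```

Each evaluation of the functional costs $O(n)$, so this sketch runs in $O(n^2)$ time; the returned coordinate still has to be mapped back to the circle through $e_p$.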

This algorithm can be extended to more general measures than the empirical one. The approach
will be the same: find the critical points of the Fréchet functional with formula (3.1) and compare
the values of the local minima. Unfortunately, there may be some computational issues, as a general
cumulative distribution function will no longer be piecewise constant.

6 Conclusion 

It is not straightforward to extend criteria such as the one given in Theorem 4.1 to more general
spaces, e.g. the $n$-dimensional sphere $\mathbb{S}^n$. Recall that the circle $\mathbb{S}^1$ is a flat space in the sense that
it is locally isometric to the Euclidean space $\mathbb{R}$. Then, the only phenomenon that induces uniqueness
issues of the Fréchet mean is the presence of a cut locus. The criterion presented in this note relies on
an explicit formula for the gradient of the Fréchet functional. Curvature has an extra effect on the metric
and makes it difficult to derive exact computations on the Fréchet functional and its gradient. Moreover,
it is not clear whether the role played by the uniform measure as a benchmark in the well-definiteness of the
Fréchet mean in $\mathbb{S}^1$ can be extended to $n$-spheres or non-flat manifolds.